{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Linear Regression\n",
    "\n",
    "Welcome to your first assignment. This exercise gives you a brief introduction to linear regression. The exercise is to be implemented in Python. Even if you've used Python before, this will help familiarize you with functions we'll need.  \n",
    "\n",
    "**Instructions:**\n",
    "- You will be using Python 3.\n",
    "- Avoid using for-loops and while-loops, unless you are explicitly told to do so.\n",
    "- Do not modify the (# GRADED FUNCTION [function name]) comment in some cells. Your work would not be graded if you change this. Each cell containing that comment should only contain one function.\n",
    "- After coding your function, run the cell right below it to check if your result is correct.\n",
    "\n",
    "**After this assignment you will:**\n",
    "- Be able to implement linear regression model using statsmodels, scikit-learn, and tensorflow\n",
    "- Work with simulated non-linear dataset\n",
    "- Compare model performance (quality of fit) of both models\n",
    "\n",
    "Let's get started!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## About iPython Notebooks ##\n",
    "\n",
    "iPython Notebooks are interactive coding environments embedded in a webpage. You will be using iPython notebooks in this class. You only need to write code between the ### START CODE HERE ### and ### END CODE HERE ### comments. After writing your code, you can run the cell by either pressing \"SHIFT\"+\"ENTER\" or by clicking on \"Run Cell\" (denoted by a play symbol) in the upper bar of the notebook. \n",
    "\n",
    "We will often specify \"(≈ X lines of code)\" in the comments to tell you about how much code you need to write. It is just a rough estimate, so don't feel bad if your code is longer or shorter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "\n",
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "import grading\n",
    "\n",
    "try:\n",
    "    import matplotlib.pyplot as plt\n",
    "    %matplotlib inline\n",
    "except: pass\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "import tensorflow as tf\n",
    "from tensorflow.python.layers import core as core_layers\n",
    "try:\n",
    "    from mpl_toolkits.mplot3d import Axes3D\n",
    "except: pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "### ONLY FOR GRADING. DO NOT EDIT ###\n",
    "submissions=dict()\n",
    "assignment_key=\"QNZTAPW2Eeeg_w5MCivhhg\" \n",
    "all_parts=[\"dtA5d\", \"2inmf\", \"FCpek\",\"78aDd\",\"qlQVj\"]\n",
    "### ONLY FOR GRADING. DO NOT EDIT ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "COURSERA_TOKEN = \" \" # the key provided to the Student under his/her email on submission page\n",
    "COURSERA_EMAIL = \" \" # the email"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def reset_graph(seed=42):\n",
    "    \"\"\"\n",
    "    Utility function to reset current tensorflow computation graph\n",
    "    and set the random seed \n",
    "    \"\"\"\n",
    "    # to make results reproducible across runs\n",
    "    tf.reset_default_graph()\n",
    "    tf.set_random_seed(seed)\n",
    "    np.random.seed(seed)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## We use artificial data for the following two specifications of regression:\n",
    "\n",
    "### Linear Regression\n",
    "\n",
    "$ y(x) = a + b_1 \\cdot X_1 + b_2 \\cdot X_2 + b_3 \\cdot X_3 + \\sigma \\cdot \\varepsilon $ \n",
    "\n",
    "where $ \\varepsilon \\sim N(0, 1) $ is a Gaussian noise, and $ \\sigma $ is its volatility, \n",
    "with the following choice of parameters:\n",
    "\n",
    "$ a = 1.0 $\n",
    "\n",
    "$ b_1, b_2, b_3 = (0.5, 0.2, 0.1) $\n",
    "\n",
    "$ \\sigma = 0.1 $\n",
    "\n",
    "$ X_1, X_2, X_3 $ will be uniformally distributed in $ [-1,1] $\n",
    "\n",
    "### Non-Linear Regression\n",
    "\n",
    "$ y(x) = a + w_{00} \\cdot X_1 + w_{01} \\cdot X_2 + w_{02} \\cdot X_3 + + w_{10} \\cdot X_1^2 \n",
    "+ w_{11} \\cdot X_2^2 + w_{12} \\cdot X_3^2 +  \\sigma \\cdot \\varepsilon $ \n",
    "\n",
    "where\n",
    "\n",
    "$ w = [[1.0, 0.5, 0.2],[0.5, 0.3, 0.15]]  $\n",
    "\n",
    "and the rest of parameters is as above, with the same values of $ X_i $"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Generate Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((7500, 3), (7500, 1))"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def generate_data(n_points=10000, n_features=3, use_nonlinear=True, \n",
    "                    noise_std=0.1, train_test_split = 4):\n",
    "    \"\"\"\n",
    "    Arguments:\n",
    "    n_points - number of data points to generate\n",
    "    n_features - a positive integer - number of features\n",
    "    use_nonlinear - if True, generate non-linear data\n",
    "    train_test_split - an integer - what portion of data to use for testing\n",
    "    \n",
    "    Return:\n",
    "    X_train, Y_train, X_test, Y_test, n_train, n_features\n",
    "    \"\"\"\n",
    "    \n",
    "    # Linear data or non-linear data?\n",
    "    if use_nonlinear:\n",
    "        weights = np.array([[1.0, 0.5, 0.2],[0.5, 0.3, 0.15]])\n",
    "    else:\n",
    "        weights = np.array([1.0, 0.5, 0.2])\n",
    "        \n",
    "\n",
    "    \n",
    "    bias = np.ones(n_points).reshape((-1,1))\n",
    "    low = - np.ones((n_points,n_features),'float')\n",
    "    high = np.ones((n_points,n_features),'float')\n",
    "        \n",
    "    np.random.seed(42)\n",
    "    X = np.random.uniform(low=low, high=high)\n",
    "    \n",
    "    np.random.seed(42)\n",
    "    noise = np.random.normal(size=(n_points, 1))\n",
    "    noise_std = 0.1\n",
    "    \n",
    "    if use_nonlinear:\n",
    "        Y = (weights[0,0] * bias + np.dot(X, weights[0, :]).reshape((-1,1)) + \n",
    "             np.dot(X*X, weights[1, :]).reshape([-1,1]) +\n",
    "             noise_std * noise)\n",
    "    else:\n",
    "        Y = (weights[0] * bias + np.dot(X, weights[:]).reshape((-1,1)) + \n",
    "             noise_std * noise)\n",
    "    \n",
    "    n_test = int(n_points/train_test_split)\n",
    "    n_train = n_points - n_test\n",
    "    \n",
    "    X_train = X[:n_train,:]\n",
    "    Y_train = Y[:n_train].reshape((-1,1))\n",
    "\n",
    "    X_test = X[n_train:,:]\n",
    "    Y_test = Y[n_train:].reshape((-1,1))\n",
    "    \n",
    "    return X_train, Y_train, X_test, Y_test, n_train, n_features\n",
    "\n",
    "X_train, Y_train, X_test, Y_test, n_train, n_features = generate_data(use_nonlinear=False)\n",
    "X_train.shape, Y_train.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Regression with Numpy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: numpy_lin_regress\n",
    "def numpy_lin_regress(X_train, Y_train):\n",
    "    \"\"\"\n",
    "    numpy_lin_regress - Implements linear regression model using numpy module\n",
    "    Arguments:\n",
    "    X_train  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_train - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    Return:\n",
    "    np.array of size (k+1 by 1) of regression coefficients\n",
    "    \"\"\"\n",
    "    ### START CODE HERE ### (≈ 3 lines of code)\n",
    "    \n",
    "    # number of features\n",
    "    ndim = X_train.shape[1] \n",
    "    \n",
    "    # add the column of ones\n",
    "    X_train = np.hstack((np.ones((X_train.shape[0], 1)), X_train))\n",
    "    theta_numpy = np.linalg.inv(X_train.T.dot(X_train)).dot(X_train.T).dot(Y_train)\n",
    "\n",
    "    \n",
    "    # default answer, replace this\n",
    "    # theta_numpy = np.array([0.] * (ndim + 1)) \n",
    "    ### END CODE HERE ###\n",
    "    return theta_numpy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Submission successful, please check on the coursera grader page for the status\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([ 0.99946227,  0.99579039,  0.499198  ,  0.20019798])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### GRADED PART (DO NOT EDIT) ###\n",
    "theta_numpy = numpy_lin_regress(X_train, Y_train)\n",
    "part_1 = list(theta_numpy.squeeze())\n",
    "try:\n",
    "    part1 = \" \".join(map(repr, part_1))\n",
    "except TypeError:\n",
    "    part1 = repr(part_1)\n",
    "submissions[all_parts[0]]=part1\n",
    "grading.submit(COURSERA_EMAIL, COURSERA_TOKEN, assignment_key,all_parts[:1],all_parts,submissions)\n",
    "theta_numpy.squeeze()\n",
    "### GRADED PART (DO NOT EDIT) ###"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Regression with Sklearn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: sklearn_lin_regress\n",
    "def sklearn_lin_regress(X_train, Y_train):\n",
    "    \"\"\"\n",
    "    Arguments:\n",
    "    X_train  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_train - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    Return:\n",
    "    np.array of size (k+1 by 1) of regression coefficients\n",
    "    \"\"\" \n",
    "    ### START CODE HERE ### (≈ 3 lines of code)\n",
    "    # use lin_reg to fit training data\n",
    "    ndim = X_train.shape[1] \n",
    "    ### START CODE HERE ### (≈ 3 lines of code)\n",
    "    try:\n",
    "        from sklearn.linear_model import LinearRegression\n",
    "    except ImportError:\n",
    "        raise(\"ImportError: No module named sklearn.linear_model found\")\n",
    "    X_train = np.hstack((np.ones((X_train.shape[0], 1)), X_train))\n",
    "    reg = LinearRegression(fit_intercept=False)\n",
    "    reg.fit(X_train, Y_train)\n",
    "    theta_sklearn = reg.coef_\n",
    "\n",
    "    ### END CODE HERE ###\n",
    "    return theta_sklearn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Submission successful, please check on the coursera grader page for the status\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([ 0.99946227,  0.99579039,  0.499198  ,  0.20019798])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### GRADED PART (DO NOT EDIT) ###\n",
    "theta_sklearn = sklearn_lin_regress(X_train, Y_train)\n",
    "part_2 = list(theta_sklearn.squeeze())\n",
    "try:\n",
    "    part2 = \" \".join(map(repr, part_2))\n",
    "except TypeError:\n",
    "    part2 = repr(part_2)\n",
    "submissions[all_parts[1]]=part2\n",
    "grading.submit(COURSERA_EMAIL, COURSERA_TOKEN, assignment_key,all_parts[:2],all_parts,submissions)\n",
    "theta_sklearn.squeeze()\n",
    "### GRADED PART (DO NOT EDIT) ###"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Regression with Tensorflow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: tf_lin_regress\n",
    "def tf_lin_regress(X_train, Y_train):\n",
    "    \"\"\"\n",
    "    Arguments:\n",
    "    X_train  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_train - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    Return:\n",
    "    np.array of size (k+1 by 1) of regression coefficients\n",
    "    \"\"\"\n",
    "    ### START CODE HERE ### (≈ 7-8 lines of code)\n",
    "    # add the column of ones\n",
    "    # define theta for later evaluation\n",
    "    ndim = X_train.shape[1] \n",
    "    ### START CODE HERE ### (≈ 7-8 lines of code)\n",
    "    X_train = np.hstack((np.ones((X_train.shape[0], 1)), X_train))\n",
    "    X = tf.constant(X_train, dtype=tf.float32, name=\"X\")\n",
    "    Y = tf.constant(Y_train, dtype=tf.float32, name=\"Y\")\n",
    "    XT = tf.transpose(X)\n",
    "    theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), Y)\n",
    "    ### END CODE HERE ###\n",
    "    with tf.Session() as sess:\n",
    "        theta_value = theta.eval()\n",
    "    return theta_value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Submission successful, please check on the coursera grader page for the status\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([ 0.99946201,  0.99579054,  0.49919799,  0.20019798], dtype=float32)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### GRADED PART (DO NOT EDIT) ###\n",
    "theta_tf = tf_lin_regress(X_train, Y_train)\n",
    "part_3 = list(theta_tf.squeeze())\n",
    "try:\n",
    "    part3 = \" \".join(map(repr, part_3))\n",
    "except TypeError:\n",
    "    part3 = repr(part_3)\n",
    "submissions[all_parts[2]]=part3\n",
    "grading.submit(COURSERA_EMAIL, COURSERA_TOKEN, assignment_key,all_parts[:3],all_parts,submissions)\n",
    "theta_tf.squeeze()\n",
    "### GRADED PART (DO NOT EDIT) ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class LinRegressNormalEq:\n",
    "    \"\"\"\n",
    "    class LinRegressNormalEq - implements normal equation, maximum likelihood estimator (MLE) solution\n",
    "    \"\"\"\n",
    "    def __init__(self, n_features, learning_rate=0.05, L=0):\n",
    "        import math as m\n",
    "        # input placeholders\n",
    "        self.X = tf.placeholder(tf.float32, [None, n_features], name=\"X\") \n",
    "        self.Y = tf.placeholder(tf.float32, [None, 1], name=\"Y\")\n",
    "    \n",
    "        # regression parameters for the analytical solution using the Normal equation\n",
    "        self.theta_in = tf.placeholder(tf.float32, [n_features+1,None])\n",
    "\n",
    "        # Augmented data matrix is obtained by adding a column of ones to the data matrix\n",
    "        data_plus_bias = tf.concat([tf.ones([tf.shape(self.X)[0], 1]), self.X], axis=1)\n",
    "        \n",
    "        XT = tf.transpose(data_plus_bias)\n",
    "        \n",
    "        #############################################\n",
    "        # The normal equation for Linear Regression\n",
    "        \n",
    "        self.theta = tf.matmul(tf.matmul(\n",
    "            tf.matrix_inverse(tf.matmul(XT, data_plus_bias)), XT), self.Y)\n",
    "        \n",
    "        # mean square error in terms of theta = theta_in\n",
    "        self.lr_mse = tf.reduce_mean(tf.square(\n",
    "            tf.matmul(data_plus_bias, self.theta_in) - self.Y))\n",
    "                       \n",
    "        #############################################\n",
    "        # Estimate the model using the Maximum Likelihood Estimation (MLE)\n",
    "        \n",
    "        # regression parameters for the Maximum Likelihood method\n",
    "        # Note that there are n_features+2 parameters, as one is added for the intercept, \n",
    "        # and another one for the std of noise  \n",
    "        self.weights = tf.Variable(tf.random_normal([n_features+2, 1]))\n",
    "        \n",
    "        # prediction from the model\n",
    "        self.output = tf.matmul(data_plus_bias, self.weights[:-1, :])\n",
    "\n",
    "        gauss = tf.distributions.Normal(loc=0.0, scale=1.0)\n",
    "\n",
    "        # Standard deviation of the Gaussian noise is modelled as a square of the \n",
    "        # last model weight\n",
    "        sigma = 0.0001 + tf.square(self.weights[-1]) \n",
    "        \n",
    "        # though a constant sqrt(2*pi) is not needed to find the best parameters, here we keep it\n",
    "        # to get the value of the log-LL right \n",
    "        pi = tf.constant(m.pi)\n",
    "    \n",
    "        log_LL = tf.log(0.00001 + (1/( tf.sqrt(2*pi)*sigma)) * gauss.prob((self.Y - self.output) / sigma ))  \n",
    "        self.loss = - tf.reduce_mean(log_LL)\n",
    "        \n",
    "        self.train_step = (tf.train.AdamOptimizer(learning_rate).minimize(self.loss), -self.loss)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: run_normal_eq\n",
    "def run_normal_eq(X_train, Y_train, X_test, Y_test, learning_rate=0.05):\n",
    "    \"\"\"\n",
    "    Implements normal equation using tensorflow, trains the model using training data set\n",
    "    Tests the model quality by computing mean square error (MSE) of the test data set\n",
    "    \n",
    "    Arguments:\n",
    "    X_train  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_train - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    X_test  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_test - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    \n",
    "    Return a tuple of:\n",
    "        - np.array of size (k+1 by 1) of regression coefficients\n",
    "        - mean square error (MSE) of the test data set\n",
    "        - mean square error (MSE) of the training data set\n",
    "    \"\"\"\n",
    "    # create an instance of the Linear Regression model class  \n",
    "    ndim = X_train.shape[1]\n",
    "    lr_mse_train = 0.\n",
    "    lr_mse_test = 0.\n",
    "    ### START CODE HERE ### (≈ 20 lines of code)\n",
    "    X = tf.placeholder(tf.float32, [None, ndim], name=\"X\")\n",
    "    Y = tf.placeholder(tf.float32, [None, 1], name=\"Y\")\n",
    "    theta_in = tf.placeholder(tf.float32, [ndim + 1, None])\n",
    "    data_plus_bias = tf.concat([tf.ones([tf.shape(X)[0], 1]), X], axis=1)\n",
    "    XT = tf.transpose(data_plus_bias)\n",
    "    \n",
    "    theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, data_plus_bias)), XT), Y)\n",
    "    lr_mse = tf.reduce_mean(tf.square(tf.matmul(data_plus_bias, theta_in) - Y))\n",
    "    \n",
    "    with tf.Session() as sess:\n",
    "        sess.run(tf.global_variables_initializer())\n",
    "        \n",
    "        theta_value = sess.run(theta, feed_dict={X: X_train, Y: Y_train})\n",
    "        lr_mse_train = sess.run(lr_mse, feed_dict={X: X_train, Y: Y_train, theta_in: theta_value})\n",
    "        lr_mse_test = sess.run(lr_mse, feed_dict={X: X_train, Y: Y_train, theta_in: theta_value})\n",
    "    ### END CODE HERE ###\n",
    "    return theta_value, lr_mse_train, lr_mse_test\n",
    "\n",
    "### (DO NOT EDIT) ###\n",
    "theta_value, lr_mse_train, lr_mse_test = run_normal_eq(X_train, Y_train, X_test, Y_test)\n",
    "### (DO NOT EDIT) ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Submission successful, please check on the coursera grader page for the status\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([ 0.99946201,  0.99579054,  0.49919799,  0.20019798], dtype=float32)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### GRADED PART (DO NOT EDIT) ###\n",
    "part_4 = list(theta_value.squeeze())\n",
    "try:\n",
    "    part4 = \" \".join(map(repr, part_4))\n",
    "except TypeError:\n",
    "    part4 = repr(part_4)\n",
    "submissions[all_parts[3]]=part4\n",
    "grading.submit(COURSERA_EMAIL, COURSERA_TOKEN, assignment_key,all_parts[:4],all_parts,submissions)\n",
    "theta_value.squeeze()\n",
    "### GRADED PART (DO NOT EDIT) ###"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: run_mle# GRADED \n",
    "def run_mle(X_train, Y_train, X_test, Y_test, learning_rate=0.05, num_iter=5000):\n",
    "    \"\"\"\n",
    "    Maximum likelihood Estimate (MLE)\n",
    "    Tests the model quality by computing mean square error (MSE) of the test data set\n",
    "    \n",
    "    Arguments:\n",
    "    X_train  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_train - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    X_test  - np.array of size (n by k) where n is number of observations \n",
    "                of independent variables and k is number of variables\n",
    "    Y_test - np.array of size (n by 1) where n is the number of observations of dependend variable\n",
    "    \n",
    "    \n",
    "    Return a tuple of:\n",
    "        - np.array of size (k+1 by 1) of regression coefficients\n",
    "        - mean square error (MSE) of the test data set\n",
    "        - mean square error (MSE) of the training data set\n",
    "    \"\"\"\n",
    "    # create an instance of the Linear Regression model class  \n",
    "    n_features = X_train.shape[1]\n",
    "    model = LinRegressNormalEq(n_features=n_features, learning_rate=learning_rate)\n",
    "    \n",
    "    # train the model\n",
    "    with tf.Session() as sess:\n",
    "        sess.run(tf.global_variables_initializer())\n",
    "\n",
    "        # Now train the MLE parameters \n",
    "        for _ in range(num_iter):\n",
    "            (_ , loss), weights = sess.run((model.train_step, model.weights), feed_dict={\n",
    "                model.X: X_train,\n",
    "                model.Y: Y_train\n",
    "                })\n",
    "\n",
    "        # make test_prediction\n",
    "        Y_test_predicted = sess.run(model.output, feed_dict={model.X: X_test})\n",
    "\n",
    "        # output std sigma is a square of the last weight\n",
    "        std_model = weights[-1]**2 \n",
    "        sess.close()\n",
    "    return weights[0:-1].squeeze(), loss, std_model\n",
    "\n",
    "weights, loss, std_model = run_mle(X_train, Y_train, X_test, Y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Submission successful, please check on the coursera grader page for the status\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([ 0.99974787,  0.99632716,  0.49939191,  0.20030148], dtype=float32)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### GRADED PART (DO NOT EDIT) ###\n",
    "part_5 = list(weights.squeeze())\n",
    "try:\n",
    "    part5 = \" \".join(map(repr, part_5))\n",
    "except TypeError:\n",
    "    part5 = repr(part_5)\n",
    "submissions[all_parts[4]]=part5\n",
    "grading.submit(COURSERA_EMAIL, COURSERA_TOKEN, assignment_key,all_parts[:5],all_parts,submissions)\n",
    "weights.squeeze()\n",
    "### GRADED PART (DO NOT EDIT) ###"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Edit Metadata",
  "coursera": {
   "course_slug": "guided-tour-machine-learning-finance",
   "graded_item_id": "dX8oQ",
   "launcher_item_id": "7Z9sN"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
